Similar resources
Bayesian Policy Reuse
A long-lived autonomous agent should be able to respond online to novel instances of tasks from a familiar domain. Acting online requires ‘fast’ responses, in terms of rapid convergence, especially when the task instance has a short duration such as in applications involving interactions with humans. These requirements can be problematic for many established methods for learning to act. In doma...
Exploration and Policy Reuse
We define Policy Reuse as a learning technique guided by past policies offering the challenge of balancing among three choices: exploitation of the ongoing learned policy, exploration of random actions, and exploration towards the past policies. In this work we introduce a new exploration strategy, π-reuse, as an intelligent bias to reuse a past policy when learning a new one. Interestingly, th...
Probabilistic Policy Reuse
We contribute Policy Reuse as a technique to improve a reinforcement learner with guidance from past learned similar policies. Our method relies on using the past policies in a novel way as a probabilistic bias where the learner faces three choices: the exploitation of the ongoing learned policy, the exploration of random unexplored actions, and the exploitation of past policies. We introduce t...
Adaptive Probabilistic Policy Reuse
Transfer algorithms allow the use of knowledge previously learned on related tasks to speed-up learning of the current task. Recently, many complex reinforcement learning problems have been successfully solved by efficient transfer learners. However, most of these algorithms suffer from a severe flaw: they are implicitly tuned to transfer knowledge between tasks having a given degree of similar...
Bayesian Policy
A long-lived autonomous agent should be able to respond online to novel instances of tasks from a familiar domain. Acting online requires ‘fast’ responses, in terms of rapid convergence, especially when the task instance has a short duration such as in applications involving interactions with humans. These requirements can be problematic for many established methods for learning to act. In doma...
Journal
Journal title: Machine Learning
Year: 2016
ISSN: 0885-6125,1573-0565
DOI: 10.1007/s10994-016-5547-y